Traditional general graph-matching search is inefficient, and refractive index data cannot be located quickly in large-data environments. To address these problems, a distributed massive molecular retrieval model based on a consistent hash function was established. To improve molecular retrieval efficiency, and in keeping with the characteristics of molecular storage structures, the continuous refractive index was discretized with a fixed-width algorithm to build a high-speed hash index, and a distributed massive retrieval system was implemented. The size of the dataset was effectively reduced, and hash collisions were handled according to visiting frequency. Experimental results show that, on chemical data containing 200 thousand molecular structures, the average retrieval time of this method is about five percent of that of traditional general graph-matching search. In addition, the model shows steady performance and high scalability. It is suitable for retrieving high-frequency molecules by refractive index in massive-data environments.
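
The three ideas named in the abstract (fixed-width discretization, a consistent hash ring, and frequency-ordered collision handling) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The bin width of 0.001, the 64 virtual nodes per server, the node names, the molecule identifiers, and all class and function names are assumptions made for illustration, and an in-memory dictionary stands in for the distributed shards.

```python
import bisect
import hashlib
import math
from collections import defaultdict

BIN_WIDTH = 0.001  # assumed bin width; the source does not state one


def bin_of(refractive_index, width=BIN_WIDTH):
    """Fixed-width discretization: continuous value -> integer bin number."""
    # The small epsilon guards against float error at exact bin boundaries.
    return math.floor(refractive_index / width + 1e-9)


def _point(key):
    """Map any key to a position on the hash ring."""
    return int(hashlib.sha256(str(key).encode()).hexdigest(), 16)


class HashRing:
    """Consistent hashing with virtual nodes (replica count is an assumption)."""

    def __init__(self, nodes, replicas=64):
        ring = sorted((_point(f"{n}#{i}"), n) for n in nodes for i in range(replicas))
        self._points = [p for p, _ in ring]
        self._owners = [n for _, n in ring]

    def node_for(self, key):
        # First ring point clockwise from the key's position, wrapping around.
        i = bisect.bisect(self._points, _point(key)) % len(self._points)
        return self._owners[i]


class RefractiveIndexStore:
    """Hypothetical index: bin number -> node -> frequency-ordered bucket."""

    def __init__(self, nodes, width=BIN_WIDTH):
        self.width = width
        self.ring = HashRing(nodes)
        # node -> bin number -> list of [molecule_id, visit_count]
        self.shards = {n: defaultdict(list) for n in nodes}

    def _bucket(self, refractive_index):
        b = bin_of(refractive_index, self.width)
        return self.shards[self.ring.node_for(b)][b]

    def add(self, molecule_id, refractive_index):
        self._bucket(refractive_index).append([molecule_id, 0])

    def visit(self, molecule_id, refractive_index):
        """Record a visit and move frequently visited molecules to the front."""
        bucket = self._bucket(refractive_index)
        for entry in bucket:
            if entry[0] == molecule_id:
                entry[1] += 1
        bucket.sort(key=lambda e: -e[1])  # stable: ties keep insertion order

    def lookup(self, refractive_index):
        """Return the molecules that share the query's bin, most visited first."""
        return [mid for mid, _ in self._bucket(refractive_index)]
```

In this sketch, molecules whose refractive indices fall in the same bin collide in one bucket, and the bucket is kept sorted by visit count so that frequently requested molecules are reached first. Only the small bucket returned by `lookup` would then need a more expensive structure comparison. Because the ring assigns bins to nodes, adding or removing a node moves only the bins adjacent to that node's ring points rather than rehashing every bin.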